Abstract


How the dispersiveness of the mixing distribution carries over to the
mixed model is qualified in terms of generalized convex functions. These
ideas are extensions of those in Shaked (1980) and Schweder (1982). A
representation akin to the one for dilations is also given for balayages
defined in terms of these generalized convex functions.


1. Introduction. In certain statistical problems, one typically has in
mind a family $\{F_\theta : \theta \in \Theta\}$ of models (distributions) for the observations.
As sometimes happens, though, the observed data may be "more dispersed" than
might be expected of the above family. This could suggest that a "mixed
model" may be a more appropriate fit since mixing introduces more dispersion
into the model.

In this paper we qualify just how "dispersiveness" in the mixing distribution
carries over to the mixed model for certain types of models. This extends
the work of Shaked (1980) and of Schweder (1982). More specifically (and
ignoring obvious measure theoretic technicalities), for a mixing distribution
$\lambda$ on $\Theta$, let $F_\lambda = \int F_\theta\,d\lambda$ denote the mixed model. When the models
$F_\theta$, $\theta \in \Theta$, arise from a family of densities $\{f_\theta : \theta \in \Theta\}$ with respect to a
$\sigma$-finite measure $m$, $f_\lambda = \int f_\theta\,d\lambda$ will denote the mixed density with respect
to $m$. Note that $f_\lambda = f_\theta$ when $\lambda = \delta_\theta$ is the mixing distribution degenerate
at $\theta$.
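
As a small numerical illustration of the mixed density defined above, the following sketch compares a Poisson model with a two-point Poisson mixture having the same mean. The Poisson family, the particular mixing weights and the truncation point are assumptions chosen only for illustration; they are not taken from the paper.

```python
import math

def poisson_pmf(x, theta):
    """Poisson density f_theta(x) with respect to counting measure."""
    return math.exp(-theta) * theta ** x / math.factorial(x)

def mixed_pmf(x, mixing):
    """Mixed density: sum over theta of f_theta(x) times the mixing mass at theta."""
    return sum(w * poisson_pmf(x, theta) for theta, w in mixing.items())

def mean_and_variance(pmf, support):
    mean = sum(x * pmf(x) for x in support)
    var = sum((x - mean) ** 2 * pmf(x) for x in support)
    return mean, var

support = range(0, 80)             # truncation; the neglected tail mass is negligible here
degenerate = {2.0: 1.0}            # mixing distribution degenerate at theta = 2
two_point = {1.0: 0.5, 3.0: 0.5}   # same mean, but a more dispersed mixing distribution

m0, v0 = mean_and_variance(lambda x: mixed_pmf(x, degenerate), support)
m1, v1 = mean_and_variance(lambda x: mixed_pmf(x, two_point), support)
print(m0, v0)   # mean and variance both close to 2
print(m1, v1)   # mean close to 2, variance close to 3
```

The mixture keeps the mean but has the larger variance, which is the kind of extra dispersion a mixed model is meant to capture.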


Shaked (1980) investigated two types of dispersiveness for one parameter
exponential families. One type was in terms of sign changes and the other in
terms of dilations. (A distribution $G$ is said to be a dilation of another
distribution $F$, written $G \stackrel{d}{>} F$, if $\int c\,dF \le \int c\,dG$ for all convex $c$.) Shaked
showed that $f_\lambda - f_{\theta^*}$ has two sign changes and the order is $+, -, +$ when $\lambda$
satisfies the first "moment" condition $\int u(\theta)\,d\lambda(\theta) = u(\theta^*)$ where
$u(\theta) = \int x f_\theta(x)\,dm(x)$. He also showed that if $u(\theta)$ is linear in $\theta$ and
$\mu \stackrel{d}{>} \lambda$, then $F_\mu \stackrel{d}{>} F_\lambda$.

Schweder (1982) further investigated this second type of dispersiveness
and showed that $F_\mu \stackrel{d}{>} F_\lambda$ whenever $\mu \stackrel{d}{>} \lambda$ if and only if the family
$\{F_\theta : \theta \in \Theta\}$ is convexly parameterized. That is, $\tilde c(\theta) = \int c(x)\,dF_\theta(x)$ is convex
whenever $c$ is convex.

The above two types of dispersiveness might be considered first order
notions of dispersiveness: the sign change type since $f_\lambda$ is compared to $f_\theta$,
which arises from the degenerate mixing distribution $\delta_\theta$; the dilation type
since $\mu \stackrel{d}{>} \lambda$ if and only if $\mu(\cdot) = \int P(\cdot\,|\,\theta)\,d\lambda(\theta)$, where $P(\cdot\,|\,\theta) \stackrel{d}{>} \delta_\theta$
is a probability distribution for each $\theta$. (See Strassen, 1965,
Theorems 2 and 8.)

Here we are interested in higher order ($k$-order, $k > 1$) notions of
dispersiveness. These higher order notions involve Tchebycheff systems
(T-systems) of functions $U = \{u_0, \ldots, u_{2k-1}\}$ and $U$-convex functions which
are defined in terms of $U$.

In Section 2, a rudimentary account of T-systems and $U$-convexity is given
and a simple characterization of $U$-convexity is proved (Theorem 2.1). Very
thorough accounts of T-systems and generalized convexity can be found in
Karlin and Studden (1966) and in Karlin (1968). A palatable introduction to
generalized convexity can be found in Roberts and Varberg (1973).

In Section 3, $U$-$\tilde U$ convexly parameterized families are defined for
T-systems $U$ and $\tilde U$. It is shown that $\{F_\theta : \theta \in \Theta\}$ is $U$-$\tilde U$ convexly
parameterized if and only if $F_\mu \stackrel{U}{>} F_\lambda$ whenever $\mu \stackrel{\tilde U}{>} \lambda$, where $\stackrel{U}{>}$ and $\stackrel{\tilde U}{>}$ are
partial orderings defined in terms of $U$ and $\tilde U$ (Theorem 3.1). In addition it
is shown that under the (equivalent) moment conditions

$\int \tilde u_j\,d\lambda = \int \tilde u_j\,d\lambda_k$, $j = 0, \ldots, 2k-1$,

or

$\int u_j\,dF_\lambda = \int u_j\,dF_{\lambda_k}$, $j = 0, \ldots, 2k-1$,

$f_\lambda - f_{\lambda_k}$ has $2k$ sign changes and the order is $+, -, \ldots, +$ when $\lambda_k$ is
discrete with $k$ mass points (Theorems 3.2, 3.3 and 3.4). The latter result
is useful in determining "if you've gone far enough" when fitting a mixed
model using a method of moments approach.

Finally, in Section 4, a necessary and sufficient condition is given to
show when a probability measure $\nu$ has the representation

$\nu(\cdot) = \int P(\cdot\,|\,x_1, \ldots, x_k) \prod_{i=1}^{k} d\lambda(x_i)$

where $P(\cdot\,|\,x_1, \ldots, x_k) \stackrel{U}{>} F_x$
and $F_x$ is the empirical distribution function for the sample
$x = (x_1, \ldots, x_k)$ (Theorem 4.1).

2. U-Convexity. Fundamental to the notion of $U$-convexity is the definition
of a Tchebycheff system. (Throughout this section, $X = \{x_i : i = 0, 1, \ldots, n+1\}$,
$x_0 < x_1 < \cdots < x_{n+1}$.)

Definition. A family of functions $U = \{u_i : i = 0, 1, \ldots, n\}$ defined on $X$
is said to be a Tchebycheff system (T-system) on $X$ if the determinant

$$u(X') = u(x'_0, \ldots, x'_n) = \begin{vmatrix} u_0(x'_0) & \cdots & u_0(x'_n) \\ u_1(x'_0) & \cdots & u_1(x'_n) \\ \vdots & & \vdots \\ u_n(x'_0) & \cdots & u_n(x'_n) \end{vmatrix}$$

is positive whenever $X' = \{x'_0 < \cdots < x'_n\} \subset X$. For a set $Y$ of cardinality
greater than $n+1$, the family $U$ is said to be a T-system on $Y$ if $U$ is a
T-system for each $X \subset Y$.

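Definition. Let $U = \{u_i : i = 0, \ldots, n\}$ be a T-system on $X$. A function $f$
is said to be $U$-convex on $X$ if the determinant

$$u_f(X) = \begin{vmatrix} u_0(x_0) & \cdots & u_0(x_{n+1}) \\ u_1(x_0) & \cdots & u_1(x_{n+1}) \\ \vdots & & \vdots \\ u_n(x_0) & \cdots & u_n(x_{n+1}) \\ f(x_0) & \cdots & f(x_{n+1}) \end{vmatrix} \ge 0.$$

If $U$ is a T-system on a set $Y$ of cardinality greater than $n+1$, $f$ is said
to be $U$-convex on $Y$ if $f$ is $U$-convex on each $X \subset Y$. A function $f$ is
said to be $U$-concave if $-f$ is $U$-convex.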
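
The two determinant conditions can be checked by direct computation. The sketch below evaluates them exactly for the classical system $\{1, x, x^2\}$ on four points; the choice of system, points and test functions, and the helper names `det`, `u_det` and `u_f_det`, are assumptions made for illustration.

```python
from fractions import Fraction

def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return sign * result

def u_det(U, points):
    """u(X') = det[u_i(x_j)] for n+1 ordered points."""
    return det([[u(x) for x in points] for u in U])

def u_f_det(U, f, points):
    """u_f(X): the same array on n+2 points with the row of f appended."""
    return det([[u(x) for x in points] for u in U] + [[f(x) for x in points]])

U = [lambda x: 1, lambda x: x, lambda x: x * x]   # classical system, n = 2
X = [0, 1, 3, 4]                                  # n + 2 ordered points

# T-system check: every (n+1)-point subset of X gives a positive determinant.
t_system_ok = all(u_det(U, X[:j] + X[j + 1:]) > 0 for j in range(len(X)))
cubic = u_f_det(U, lambda x: x ** 3, X)            # positive, so x^3 is U-convex on X
poly = u_f_det(U, lambda x: 2 - x + 5 * x * x, X)  # zero for a polynomial in the u's
print(t_system_ok, cubic, poly)
```

The zero determinant for the polynomial row is the situation in which a function is both $U$-convex and $U$-concave.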


Remark. Note that a polynomial in the $u$'s, $P(x) = A_0 u_0(x) + A_1 u_1(x)
+ \cdots + A_n u_n(x)$, is both $U$-convex and $U$-concave.

The next theorem gives a useful characterization of $U$-convexity. For the
usual definition of convexity, i.e., $u_0 \equiv 1$ and $u_1(x) = x$, it corresponds to
the midpoint of the chord between two points on the graph of a convex function
lying above the function.

For this characterization we need the following notation. Let $k = [(n+1)/2]$
where $[x]$ denotes the integer part of $x$. For $X = \{x_0 < x_1 < \cdots < x_{n+1}\}$,
let $t_k = x_n$, $t_{k-1} = x_{n-2}$, $t_{k-2} = x_{n-4}$, ..., i.e., $t_{k-j} = x_{n-2j}$ for
$j = 0, 1, \ldots, k-1$. For $t = (t_1, t_2, \ldots, t_k)$, let $F_t$ denote both the probability
distribution and probability measure which places mass $k^{-1}$ at each $t_i$.
$F_t$ is just the empirical distribution for the sample $t_1, \ldots, t_k$.

Theorem 2.1. A function $f$ is $U$-convex on $X$ if and only if

(2.1)  $\int f\,dF_t \le \int f\,d\lambda$

for each finite measure $\lambda$ with support contained in $X$ satisfying

(2.2)  $\int u_j\,dF_t = \int u_j\,d\lambda$ for $j = 0, \ldots, n$.
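
For the classical case $u_0 \equiv 1$, $u_1(x) = x$ (so $n = 1$, $k = 1$ and $t_1 = x_1$), condition (2.2) says that the measure has total mass 1 and mean $x_1$, and (2.1) is then a Jensen-type inequality. The following sketch checks this exactly on three points; the helper `matching_measure` and the numerical values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def matching_measure(x0, x1, x2, s):
    """A measure on {x0, x1, x2} with total mass 1 and mean x1 (hypothetical helper).

    It moves a fraction s of the unit mass at x1 out to x0 and x2 while
    keeping the integrals of u_0 = 1 and u_1(x) = x unchanged, as in (2.2).
    """
    w0 = Fraction(x2 - x1, x2 - x0)
    w2 = Fraction(x1 - x0, x2 - x0)
    return {x0: s * w0, x1: 1 - s, x2: s * w2}

def integral(f, measure):
    return sum(f(x) * mass for x, mass in measure.items())

x0, x1, x2 = 0, 1, 4
lam = matching_measure(x0, x1, x2, Fraction(1, 2))

# (2.2): the integrals of u_0 and u_1 agree with those of the point mass at t_1 = x1.
assert integral(lambda x: 1, lam) == 1
assert integral(lambda x: x, lam) == x1

convex_f = lambda x: x * x
concave_f = lambda x: -abs(x - 1)
convex_gap = integral(convex_f, lam) - convex_f(x1)     # nonnegative, as in (2.1)
concave_gap = integral(concave_f, lam) - concave_f(x1)  # negative: (2.1) fails
print(convex_gap, concave_gap)
```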


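Proof. ($\Leftarrow$) If $f$ is not $U$-convex, then $u_f(X) < 0$. So,

(2.3)  $A\Delta = e$, where

$$A = \begin{pmatrix} u_0(x_0) & \cdots & u_0(x_{n+1}) \\ u_1(x_0) & \cdots & u_1(x_{n+1}) \\ \vdots & & \vdots \\ u_n(x_0) & \cdots & u_n(x_{n+1}) \\ f(x_0) & \cdots & f(x_{n+1}) \end{pmatrix}, \quad \Delta = \begin{pmatrix} \Delta_0 \\ \Delta_1 \\ \vdots \\ \Delta_n \\ \Delta_{n+1} \end{pmatrix}, \quad e = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ -1 \end{pmatrix},$$

has a solution $\Delta$.

By Cramer's rule, $\Delta_j = (-1)^{n+4+j} u(X_j)/u_f(X)$ where $X_j = X - \{x_j\}$.
Since $u_f(X) < 0 < u(X_j)$, $\Delta_j$ alternates in sign with $\Delta_n < 0$. So,
$c = \max\{-\Delta_j : j = n, n-2, \ldots\} > 0$.

Let $\alpha_j = c + \Delta_j$ for $j = n, n-2, \ldots$ and $\alpha_j = \Delta_j$ for the other values of $j$.
Then, by (2.3),

$$0 = -\sum_{j=0}^{n+1} u_i(x_j)\Delta_j = c \sum_{j=n,n-2,\ldots} u_i(x_j) - \sum_{j=0}^{n+1} u_i(x_j)\alpha_j, \quad i = 0, \ldots, n,$$

and

$$1 = -\sum_{j=0}^{n+1} f(x_j)\Delta_j = c \sum_{j=n,n-2,\ldots} f(x_j) - \sum_{j=0}^{n+1} f(x_j)\alpha_j.$$

Setting $\lambda(\{x_j\}) = \alpha_j (kc)^{-1} \ge 0$, we have from the above that (2.2) is
satisfied but $\int f\,d\lambda < \int f\,dF_t$. This proves the "if" part of the theorem.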


($\Rightarrow$) Now let $f$ be $U$-convex. If $u_f(X) = 0$, then $f$ is a polynomial
in the $u$'s. In this case, from (2.2) equality holds in (2.1). Thus to
complete the proof, we only need to consider when $u_f(X) > 0$.

Let $\lambda$ denote a measure whose support is contained in $X$ and which
satisfies (2.2). Let $\Delta_i = \Delta(\{x_i\}) = \lambda(\{x_i\}) - F_t(\{x_i\})$ and $c = \int f\,d\Delta$.

Then for $A$ and $e$ as defined in (2.3), $A\Delta = (0, \ldots, 0, c)' = -ce$. So, from Cramer's rule,
$0 \le \lambda(\{x_{n+1}\}) = \Delta_{n+1} = c(-1)^{2(n+2)} u(X_{n+1})/u_f(X)$. Since $u_f(X) > 0$ and
$u(X_{n+1}) > 0$, it follows that $\int f\,d\lambda - \int f\,dF_t = c \ge 0$. ||


U-convex  functions  can  be  used  to  define  a  measure  of  dispersiveness  for 
probability  measures.  This  is  needed  in  the  next  section  to  qualify  how 
dispersiveness  of  the  mixing  distribution  carries  over  to  the  mixed  model. 

The  terminology  is  from  Meyer  (1966). 


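Definition. Let $U = \{u_0, \ldots, u_n\}$ be a T-system on a Borel set $Y \subset R$.
Let $\lambda$ and $\nu$ be two finite measures on $Y$. If $\int f\,d\lambda \le \int f\,d\nu$ for all
integrable $U$-convex $f$, then $\nu$ is called a balayage of $\lambda$. This is written
as $\lambda \stackrel{U}{<} \nu$ or $\nu \stackrel{U}{>} \lambda$. Note that if $u \equiv 1$ is in $U$, then $\int d\lambda = \int d\nu$.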


3. $U$-$\tilde U$ Convexly Parameterized Families. Let $\{F_\theta : \theta \in \Theta\}$ be a family of
distribution functions on $X \subset R$ where $\Theta \subset R$. For a (integrable) function $g$,
let $\tilde g(\theta) = \int g(x)\,dF_\theta(x)$.

Definition. Let $U = \{u_0, \ldots, u_n\}$ be a T-system on $X$ and let
$\tilde U = \{\tilde u_0, \tilde u_1, \ldots, \tilde u_n\}$. The family $\{F_\theta : \theta \in \Theta\}$ is said to be $U$-$\tilde U$ convexly
parameterized if (1) $\tilde U$ is a T-system on $\Theta$, and (2) $\tilde c$ is $\tilde U$-convex whenever $c$
is $U$-convex. (Implicit here is that $u_j(x)$ is integrable for each $F_\theta$ and
that the cardinalities of $X$ and of $\Theta$ are greater than $n$.)

Example 1. Let $F_\theta$ be absolutely continuous with respect to some
$\sigma$-finite measure $m$ on $X$. Let $f_\theta = dF_\theta/dm$. If $f_\theta(x)$ is strictly totally
positive (STP) of order $n+1$ (see Karlin, 1968, pages 11 and 12 for the
definition), then $\tilde U$ is a T-system whenever $U$ is a T-system. This follows
from the basic composition formula on page 98 of Karlin (1968) (see also
Theorem 3.2 on page 284).


Example 2. The one parameter exponential family with density
$f_\theta(x) = e^{x\theta}\beta(\theta)$ is STP of all orders up to the minimum of the cardinalities
of $\Theta$ and $X$. Such a family includes the binomial family, the Poisson, the
gamma with fixed shape parameter, and the normal with fixed variance. See
Karlin (1968), page 19, for details.
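
For the classical first order case ($u_0 \equiv 1$, $u_1(x) = x$), being convexly parameterized means that the transformed function $\int c(x)\,dF_\theta(x)$ is convex in $\theta$ whenever $c$ is convex. The sketch below checks this numerically for the Poisson family with one convex test function; the test function, the grid of parameter values and the truncation point are illustrative assumptions.

```python
import math

def poisson_pmf(x, theta):
    return math.exp(-theta) * theta ** x / math.factorial(x)

def transformed(c, theta, cutoff=120):
    """Integral of c against the Poisson(theta) law, truncated at a hypothetical cutoff."""
    return sum(c(x) * poisson_pmf(x, theta) for x in range(cutoff))

def second_differences(g, grid):
    """g(a) - 2 g(b) + g(c) over consecutive equally spaced grid points a < b < c."""
    vals = [g(t) for t in grid]
    return [vals[i - 1] - 2 * vals[i] + vals[i + 1] for i in range(1, len(vals) - 1)]

c = lambda x: abs(x - 3)                    # convex in x
grid = [0.5 * i for i in range(1, 21)]      # theta = 0.5, 1.0, ..., 10.0
diffs = second_differences(lambda t: transformed(c, t), grid)
print(min(diffs) > 0)   # positive second differences are consistent with convexity in theta
```

A finite grid check of this kind can only suggest convexity; it is not a substitute for the total positivity argument.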

Analogous to Schweder's (1982) theorem on page 166 for convexly
parameterized families, the following theorem points out the connection between
$U$-$\tilde U$ convexly parameterized families and balayages.

Theorem 3.1. Let $U = \{u_0, \ldots, u_n\}$ be a T-system for which $\tilde U$ is a
T-system for the family $\{F_\theta : \theta \in \Theta\}$. Then $\{F_\theta : \theta \in \Theta\}$ is $U$-$\tilde U$ convexly
parameterized if and only if $F_\lambda \stackrel{U}{<} F_\nu$ whenever $\lambda \stackrel{\tilde U}{<} \nu$.


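Proof. ($\Leftarrow$) Let $\Theta_{n+1} = \{\theta_0 < \theta_1 < \cdots < \theta_{n+1}\} \subset \Theta$. For $k = [(n+1)/2]$ and
$j = 0, 1, \ldots, k-1$, let $t_{k-j} = \theta_{n-2j}$. Let $F_t$ denote the probability distribution
placing mass $1/k$ at each of the points $t_1, \ldots, t_k$ and let $\lambda$ be any other
finite measure with support contained in $\Theta_{n+1}$ and satisfying

$\int \tilde u_j\,d\lambda = \int \tilde u_j\,dF_t$, $j = 0, \ldots, n$.

Then, by Theorem 2.1, $F_t \stackrel{\tilde U}{<} \lambda$. So, $F_{F_t} \stackrel{U}{<} F_\lambda$. Thus, if $c$ is $U$-convex,

$\int \tilde c\,dF_t = \int\!\!\int c(x)\,dF_\theta(x)\,dF_t(\theta) = \int c(x)\,dF_{F_t}(x) \le \int c(x)\,dF_\lambda(x) = \int\!\!\int c(x)\,dF_\theta(x)\,d\lambda(\theta) = \int \tilde c\,d\lambda.$

This with another application of Theorem 2.1 yields that $\tilde c$ is $\tilde U$-convex.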


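($\Rightarrow$) Let $\lambda \stackrel{\tilde U}{<} \nu$ and let $c$ be $U$-convex. Since $\{F_\theta : \theta \in \Theta\}$ is $U$-$\tilde U$
convexly parameterized, $\tilde c$ is $\tilde U$-convex. So,

$\int c\,dF_\lambda = \int\!\!\int c(x)\,dF_\theta(x)\,d\lambda(\theta) = \int \tilde c\,d\lambda \le \int \tilde c\,d\nu = \int\!\!\int c(x)\,dF_\theta(x)\,d\nu(\theta) = \int c\,dF_\nu.$

Consequently, $F_\lambda \stackrel{U}{<} F_\nu$. ||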


In the next three theorems sign change results are given for $f_\lambda - f_{\lambda_k}$
when

(3.1)  $\int \tilde u_j\,d\lambda = \int \tilde u_j\,d\lambda_k$ for $j = 0, 1, \ldots, 2k-1$,

and $\lambda_k$ is discrete with $k$ mass points. In these three theorems it is
assumed that, for each $\theta \in \Theta$, $F_\theta$ has a density $f_\theta$ with respect to a
$\sigma$-finite measure $m$ which is STP$_{2k+1}$ on $\Theta \times X$, $X$ the support of $m$.
Throughout it also is assumed that, for each $j$, $u_j$ is integrable with
respect to $f_\lambda$ and $f_{\lambda_k}$.

The first theorem deals with the classical T-system
$U = \{1, x^1, \ldots, x^{2k-1}\}$ and generalizes Theorem 1 of Shaked (1980).

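Theorem 3.2. Let $\lambda$ and $\lambda_k$ be two mixing distributions satisfying (3.1)
for $U = \{1, x^1, \ldots, x^{2k-1}\}$ where $\lambda_k$ is discrete with $k$ mass points. If
$m(\{f_\lambda \ne f_{\lambda_k}\}) > 0$, then $f_\lambda - f_{\lambda_k}$ has $2k$ sign changes on $X$ and the order
is $+, -, \ldots, +$.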
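
The simplest instance of this sign change statement, $k = 1$, can be examined numerically. In the sketch below the Poisson family, the two-point mixing distribution and the range of $x$ values are assumptions chosen for illustration; for the Poisson family the moment condition (3.1) with $k = 1$ reduces to equal total mass and equal mean of the two mixing distributions.

```python
import math

def poisson_pmf(x, theta):
    return math.exp(-theta) * theta ** x / math.factorial(x)

def mixed_pmf(x, mixing):
    return sum(w * poisson_pmf(x, theta) for theta, w in mixing.items())

def sign_pattern(values):
    """Signs of the nonzero entries with consecutive repeats collapsed."""
    pattern = []
    for v in values:
        s = (v > 0) - (v < 0)
        if s != 0 and (not pattern or pattern[-1] != s):
            pattern.append(s)
    return pattern

lam = {1.0: 0.5, 3.0: 0.5}     # mixing distribution with mean 2
lam_k = {2.0: 1.0}             # k = 1 mass point with the same mean

diff = [mixed_pmf(x, lam) - mixed_pmf(x, lam_k) for x in range(0, 30)]
print(sign_pattern(diff))      # [1, -1, 1]
```

The collapsed pattern shows two sign changes in the order $+, -, +$, in line with the $k = 1$ case.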


Proof. Note that from the definition of STP$_{2k+1}$ it is implicit in the
statement of the theorem that both $\Theta$ and $X$ are of cardinality greater than
$2k$.

For $\theta$ a mass point of $\lambda_k$, let $s(\theta) = -1$ if $\lambda(\{\theta\}) < \lambda_k(\{\theta\})$ and
let $s(\theta) = 1$ otherwise. So $s(\theta)$ has at most $2k$ sign changes.

Let $\mu$ be the measure given by $d\mu = s(\cdot)\,d(\lambda - \lambda_k)$. Since
$\Delta(x) = f_\lambda(x) - f_{\lambda_k}(x) = \int s(\theta) f_\theta(x)\,d\mu(\theta)$ and $f_\theta(x)$ is STP$_{2k+1}$, it follows
from the variation diminishing theorem (Karlin, 1968, page 233) that $\Delta(\cdot)$ can
have at most $2k$ sign changes. If there are less than $2k$ sign changes,
say $l$ sign changes, then there are $l$ points in $X$, $x_1 < x_2 < \cdots < x_l$, such
that $\Delta(x)\Delta(y) \le 0$ when $x \in I_j$ and $y \in I_{j+1}$, $j = 0, \ldots, l-1$, and $I_0 = (-\infty, x_1)$,
$I_1 = (x_1, x_2)$, ..., $I_l = (x_l, \infty)$. Let $P(x) = (x - x_1)(x - x_2) \cdots (x - x_l)$. Since
$P(x)$ is a polynomial of degree $l \le 2k-1$, it follows from (3.1) that

(3.2)  $\int P(x)\Delta(x)\,dm(x) = 0$.

Since $P(x)\Delta(x)$ is of the same sign and $P(x) \ne 0$ except at $x_1, \ldots, x_l$,

(3.3)  $\Delta(x) = 0$ a.e. $[m]$ on $X - X_0$, $X_0 = \{x_1, \ldots, x_l\}$, from (3.2).

Thus, for $m_n = \Delta(x_n)\,m(\{x_n\})$,

$0 = \sum_{n=1}^{l} x_n^j\,m_n$, $j = 0, \ldots, l-1$,

from (3.1) and (3.3). So $m_n = 0$ for each $n$. This with (3.3) implies that $f_\lambda = f_{\lambda_k}$
a.e. $[m]$ which contradicts the hypotheses of the theorem. ||


When $U = \{u_0, \ldots, u_{2k-1}\}$ is a Haar system, i.e. $\{u_0, \ldots, u_j\}$ is a
T-system for $j = 0, 1, \ldots, 2k-1$, then the next theorem is a consequence of
Theorem 5.2 on page 30 of Karlin and Studden (1966) and the above proof with
$x^j$ replaced by $u_j(x)$.

Theorem 3.3. Assume that the support of $m$, $X$, is contained in a finite
interval $[a, b]$. Let $U = \{u_0, u_1, \ldots, u_{2k-1}\}$ be a Haar system of continuous
functions on $[a, b]$. Let $\lambda$ and $\lambda_k$ be two mixing distributions satisfying
(3.1) where $\lambda_k$ is discrete with $k$ mass points. If $m(\{f_\lambda \ne f_{\lambda_k}\}) > 0$,
then $f_\lambda - f_{\lambda_k}$ has $2k$ sign changes on $X$ and the order is $+, -, \ldots, +$.

For the next theorem, it is assumed that $U = \{u_0, u_1, \ldots, u_{2k}\}$ is a
Descartes system, i.e., $\{u_{i_1}, u_{i_2}, \ldots, u_{i_m}\}$ is a T-system for each
$\{i_1 < \cdots < i_m\} \subset \{0, \ldots, 2k\}$.

Theorem 3.4. Let $U = \{u_0, \ldots, u_{2k}\}$ be a Descartes system on $X$. Let $\lambda$
and $\lambda_k$ be two mixing distributions satisfying (3.1) where $\lambda_k$ is discrete
with $k$ mass points. If $m(\{f_\lambda \ne f_{\lambda_k}\}) > 0$, then $f_\lambda - f_{\lambda_k}$ has $2k$ sign
changes on $X$ and the order is $+, -, \ldots, +$.

Proof. As in the first part of the proof of Theorem 3.2, $\Delta = f_\lambda - f_{\lambda_k}$
has at most $2k$ sign changes by the variation diminishing theorem (page 233
of Karlin, 1968).

Since $U$ is a Descartes system, $u_j(x)$ is STP$_{2k+1}$ on $\{0, 1, \ldots, 2k\} \times X$.
If $\Delta$ has less than $2k$ sign changes, say $l \le 2k-1$ sign changes, another
application of the variation diminishing theorem shows that
$g(j) = \int u_j(x)\Delta(x)\,dm(x)$ can have at most $l$ sign changes on $\{0, 1, \ldots, 2k\}$ where
zeroes of $g$ can be arbitrarily assigned either sign. But this leads to
a contradiction since $g(j) = 0$ for $j = 0, \ldots, 2k-1$. ||

Remark. These theorems should be compared with Theorems 5.4 and 5.5 on
pages 409 and 410 of Karlin and Studden (1966). Note that there $U$ is an
extended complete T-system (or what might be called an extended Haar system)
which involves assumptions on the derivatives of the $u$'s.


4. A Representation Theorem. For $k$ a fixed positive integer, let
$U = \{u_0, u_1, \ldots, u_{2k-1}\}$ be a T-system of continuous functions on $I = (a, b)$, an
open interval. When $k > 1$ it shall be further required that $U$ be an
extended T-system (ET-system), i.e., in addition to $U$ being a T-system, each
$u_j \in C^{2k-1}(I)$ and, for $l$ distinct values of the $x$'s ($l = 1, \ldots, 2k-1$),

$a < x_0 = x_1 = \cdots = x_{q_1} < x_{q_1+1} = x_{q_1+2} = \cdots = x_{q_2} < \cdots < x_{q_{l-1}+1} = \cdots = x_{q_l} < b$, $q_l = 2k-1$,

the following determinants are all positive:

$$u^*(x_0, x_1, \ldots, x_{2k-1}) = \begin{vmatrix} u_0(x_{q_1}) & \cdots & u_0^{(q_1)}(x_{q_1}) & u_0(x_{q_2}) & \cdots & u_0(x_{q_l}) & \cdots & u_0^{(q_l - q_{l-1} - 1)}(x_{q_l}) \\ u_1(x_{q_1}) & \cdots & u_1^{(q_1)}(x_{q_1}) & u_1(x_{q_2}) & \cdots & u_1(x_{q_l}) & \cdots & u_1^{(q_l - q_{l-1} - 1)}(x_{q_l}) \\ \vdots & & \vdots & \vdots & & \vdots & & \vdots \\ u_{2k-1}(x_{q_1}) & \cdots & u_{2k-1}^{(q_1)}(x_{q_1}) & u_{2k-1}(x_{q_2}) & \cdots & u_{2k-1}(x_{q_l}) & \cdots & u_{2k-1}^{(q_l - q_{l-1} - 1)}(x_{q_l}) \end{vmatrix}$$

(See Karlin and Studden, 1966, page 6.) In this section a representation is
obtained for balayages defined in terms of $U$-convex functions which is akin to
the (Hardy-Littlewood-Polya-Blackwell-Stein-Sherman-Cartier-Fell-Meyer-
Strassen) representation for dilations (see Strassen, 1965, Theorems 2 and 8).

To state the representation theorem requires the following notation.

Let $\mathcal{F} = \{f : f$ is $U$-concave on $I\}$. Note that since $U$ is a T-system of
continuous functions, any $f \in \mathcal{F}$ is continuous. Furthermore, when $U$ is
an ET-system with $k > 1$, $f \in \mathcal{F}$ is differentiable. (Theorems B and D on pages
248 and 249 of Roberts and Varberg, 1973, or Theorem 3.4 on page 188 of Karlin,
1968).

For $x \in I^k$ and $f$ a real valued function on $I$, let
$\bar f(x) = (f(x_1) + \cdots + f(x_k))/k$. Let $B$ and $B^k$ denote the Borel subsets of $R$
and $R^k$, respectively. For $\nu$ a probability measure (p.m.) on $(R, B)$, let
$S(\nu) = \{x : \nu((x - \epsilon, x + \epsilon)) > 0$ for every $\epsilon > 0\}$ denote the support of $\nu$.
Note that $S(\nu)$ is always closed.

The following conditions are imposed in the representation theorem:

(c1)  $\nu$ and $\lambda$ are two p.m.'s on $(R, B)$ with supports contained in a
compact interval $K \subset I$, and

(c2)  $\int f_1 \wedge \cdots \wedge f_m\,d\nu \le \int \bar f_1 \wedge \cdots \wedge \bar f_m \prod_{i=1}^{k} d\lambda(x_i)$
whenever $f_i \in \mathcal{F}$, $i = 1, \ldots, m$, $m = 1, 2, \ldots$.

Below $P(\cdot\,|\,\cdot)$ denotes a Markov kernel on $B \times K^k$, i.e., $P(\cdot\,|\,x)$ is a
p.m. on $(R, B)$ for each $x \in K^k$ and $P(A\,|\,\cdot)$ is $B_{K^k}$ ($=$ {Borel subsets
of $K^k$}) measurable for each $A \in B$.

Theorem 4.1. Under condition (c1), (c2) is necessary and sufficient for
$\nu(A) = \int P(A\,|\,x) \prod_{i=1}^{k} d\lambda(x_i)$ for every $A \in B$, where $P(\cdot\,|\,\cdot)$ is a Markov
kernel on $B \times K^k$ with $P(\cdot\,|\,x) \stackrel{U}{>} F_x$ for every $x \in K^k$.

The  proof  of  Theorem  4.1,  though  somewhat  involved,  is  really  along  the 
line  of  Strassen's  (1965)  proof  for  dilations.  Before  giving  the  proof  some 
further  quantities  need  to  be  defined  and  some  lemmas  need  to  be  stated  and 
proved. 

Let $K$ be a compact interval contained in $I$. Later $K$ will be chosen
to contain $S(\nu)$ and $S(\lambda)$. Let $D$ denote the set of discrete p.m.'s on
$(K, B_K)$ with at most $k$ mass points.

Let $M$ denote the moment space $\{m \in R^{2k} : m_j = \int u_{j-1}\,dD,\ j = 1, \ldots, 2k,\ D \in D\}$.
Note that Theorem 2.1 and case 2 (ii) on pages 42 and 46, respectively, of
Karlin and Studden (1966) guarantee that if $m \in M$ are the "moments" of a p.m.
with support contained in $K$, then there is a (unique) $D_m \in D$ with moments $m$.
Consequently $M$ is convex and, since the $u$'s are continuous, it is easy to
see that $M$ is compact.

Let $f \in C(K)$. For $m \in M$ and $x \in K^k$, let

$l_f(m) = \sup\{\int f\,d\mu : \mu \stackrel{U}{>} D_m,\ S(\mu) \subset K\}$

and

$h_f(x) = \sup\{\int f\,d\mu : \mu \stackrel{U}{>} F_x,\ S(\mu) \subset K\}$

where $F_x$ is the empirical distribution of the sample $x_1, x_2, \ldots, x_k$. Note
that in the definition of $l_f$ and $h_f$, $\mu$ is a p.m. since $u_0 \equiv 1$.

Let $m(\cdot) : K^k \to M$ be given by $m_j(x) = \int u_{j-1}\,dF_x = \sum_{i=1}^{k} u_{j-1}(x_i)/k$.
Obviously $m(\cdot)$ is continuous and it follows from the definition of $l_f$ and
$h_f$ that $h_f(x) = l_f(m(x))$.

In Lemma 4.2, the relative interior of $M$ refers to the interior of $M$
when $M$ is viewed as a subset of the smallest affine set containing it (see
Rockafellar, 1970).

Lemma 4.2. $l_f(\cdot)$ is concave on $M$, and consequently, continuous on the
relative interior of $M$.

Proof. That $l_f(\cdot)$ is continuous on the relative interior of $M$ is
immediate from Theorem 10.1 of Rockafellar (1970) once $l_f(\cdot)$ is shown to be
concave on $M$.

To do this, let $m_1$ and $m_2 \in M$ ($m_1 \ne m_2$), $\alpha \in (0, 1)$ and $\bar\alpha = 1 - \alpha$. Since $M$
is convex, $m_3 = \alpha m_1 + \bar\alpha m_2 \in M$. For $i = 1$ and $2$, let $\lambda_i \stackrel{U}{>} D_{m_i}$ and let
$\lambda_3 = \alpha\lambda_1 + \bar\alpha\lambda_2$. That $l_f$ is concave on $M$ will follow immediately from the
definition of $l_f$ once it is shown that $\lambda_3 \stackrel{U}{>} D_{m_3}$. Since $\stackrel{U}{>}$ is transitive,
it suffices to show that $D = \alpha D_{m_1} + \bar\alpha D_{m_2} \stackrel{U}{>} D_{m_3}$.

Case 1: $k = 1$. Let $x < y < z$ denote the three mass points of $D$ and $D_{m_3}$
and let $g$ be $U$-convex. To avoid trivialities, assume that $u_g(x, y, z) > 0$.
First we show that $y$ is the mass point corresponding to $D_{m_3}$. If not,
assume that $x$ is the mass point corresponding to $D_{m_3}$. Then

$$\begin{pmatrix} 1 & 1 & 1 \\ u_1(x) & u_1(y) & u_1(z) \\ g(x) & g(y) & g(z) \end{pmatrix} \begin{pmatrix} -1 \\ \alpha \\ \bar\alpha \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix}, \quad c = \int g\,d(D - D_{m_3}).$$

By Cramer's rule, $-1 = c\,u(y, z)/u_g(x, y, z)$ and $0 < \bar\alpha = c\,u(x, y)/u_g(x, y, z)$,
which is a contradiction since $u(x, y)$, $u(y, z)$ and $u_g(x, y, z)$ are all positive.
Similarly $z$ cannot be the mass point of $D_{m_3}$.

Since $y$ is the mass point corresponding to $D_{m_3}$,

$$\begin{pmatrix} 1 & 1 & 1 \\ u_1(x) & u_1(y) & u_1(z) \\ g(x) & g(y) & g(z) \end{pmatrix} \begin{pmatrix} \alpha \\ -1 \\ \bar\alpha \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix}, \quad c = \int g\,d(D - D_{m_3}).$$

Again by Cramer's rule, $0 < \alpha = c\,u(y, z)/u_g(x, y, z)$. So, $0 < c$ since
$u(y, z)$ and $u_g(x, y, z)$ are both positive.

Case 2: $k > 1$. Let $x_i \in K$, $i = 1, \ldots, l \le k$, denote the mass points of
$D_{m_3}$. If $l < k$, let $x_i \in K$, $i = l+1, \ldots, k$, be chosen so that $x_1, \ldots, x_k$
are all distinct. Let $y_1 < y_2 < \cdots < y_k$ denote the ordered $x$'s.

Let $g$ be $U$-convex. Since $U$ is an ET-system, recall that $g$ is
differentiable and $u^*(y_1, y_1, y_2, y_2, \ldots, y_k, y_k) > 0$. So there exists a polynomial
$P(x)$ in the $u$'s such that $P(x_i) = g(x_i)$ and $P'(x_i) = g'(x_i)$ for $i = 1, \ldots, k$.
By Theorem 2.2 on pages 282 and 283 of Karlin (1968), $g(x) \ge P(x)$ on $K$. Thus,
since the "moments" of $D$ agree with those of $D_{m_3}$,

$\int g\,dD \ge \int P\,dD = \int P\,dD_{m_3} = \int g\,dD_{m_3}$,

where the last equality follows since $P = g$ on $S(D_{m_3})$. So, $D \stackrel{U}{>} D_{m_3}$. ||

Lemma 4.3. $h_f(x)$ is continuous on $K^k$ with $h_f \ge \bar f$.

Proof. That $h_f \ge \bar f$ is immediate from the definition of $h_f$.

Let $x \in K^k$. If the coordinates of $x$ are all distinct, it follows from
Theorem 2.1 on page 42 of Karlin and Studden (1966) that $m(x)$ must be in the
relative interior of $M$. Since $h_f(x) = l_f(m(x))$, it follows from Lemma 4.2
that $h_f$ is continuous at $x$.

Now consider the case when at least two coordinates of $x$ agree. Let
$y_1 < \cdots < y_l$, $l < k$, denote the distinct values of $x_1, \ldots, x_k$. First we show
that $\lambda = F_x$ if $\lambda \stackrel{U}{>} F_x$ with $S(\lambda) \subset K$, in which case, $h_f(x) = \bar f(x)$.
Since $l < k$, by Theorem 5.2 on page 30 of Karlin and Studden there
exists a nonnegative polynomial $P(x)$ in the $u$'s whose only zeroes on $K$ are
$y_1, \ldots, y_l$. Since $P(\cdot)$ is both $U$-concave and $U$-convex, $\int P\,d(\lambda - F_x) = 0$
whenever $\lambda \stackrel{U}{>} F_x$ and $S(\lambda) \subset K$. But this implies that $S(\lambda) \subset \{y_1, \ldots, y_l\}$.
So $\lambda \in D$. Since the "moments", $m$, uniquely determine $D_m$ and $\lambda \stackrel{U}{>} F_x$
implies that $\int u_j\,d\lambda = \int u_j\,dF_x$ for each $j$, $\lambda = F_x$.

Let $x_n \to x$ through points in $K^k$ as $n \to \infty$. Now we will show that
$h_f(x_n) \to \bar f(x) = h_f(x)$. Let $\epsilon_n \downarrow 0$, $\epsilon_n > 0$ as $n \to \infty$. Choose $\mu_n \stackrel{U}{>} F_{x_n}$ with
$S(\mu_n) \subset K$ such that $h_f(x_n) \le \int f\,d\mu_n + \epsilon_n$. Since $f \in C(K)$, it suffices to
show that $\mu_n \to F_x$ in distribution since then

$\bar f(x) = \lim \bar f(x_n) \le \liminf h_f(x_n) \le \limsup h_f(x_n) \le \lim \int f\,d\mu_n = \int f\,dF_x = \bar f(x).$

Since $K$ is compact, $\{\mu_n\}$ is tight. Thus to show that $\mu_n \to F_x$ in
distribution it suffices to show that if $\{\mu_{n_m}\}$ is a convergent subsequence
of $\{\mu_n\}$, say converging to $\mu$, then $\mu \stackrel{U}{>} F_x$, since then $\mu = F_x$ by the first part of the proof.

To see this, let $g$ be $U$-convex. Since $g$ is $U$-convex, $g$ is
continuous on $K$. Since $\mu_{n_m} \stackrel{U}{>} F_{x_{n_m}}$,

$\int g\,d\mu = \lim \int g\,d\mu_{n_m} \ge \lim \int g\,dF_{x_{n_m}} = \int g\,dF_x.$

So $\mu \stackrel{U}{>} F_x$. ||

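Now let $\mathcal{K}$ denote the set of functions on $K^k$ which are uniform limits
on $K^k$ of functions of the form $\bar f_1 \wedge \bar f_2 \wedge \cdots \wedge \bar f_m$, $f_i \in \mathcal{F}$, $i = 1, \ldots, m$, $m = 1, 2, \ldots$.
For two p.m.'s $\Lambda$ and $\nu$ on $(K^k, B_{K^k})$ write $\nu \stackrel{\mathcal{K}}{>} \Lambda$ if $\int f\,d\nu \le \int f\,d\Lambda$
for every $f \in \mathcal{K}$, i.e., $\nu$ is a balayage of $\Lambda$ under $\stackrel{\mathcal{K}}{>}$. Let $\delta_x$ denote
the p.m. which is degenerate at $x$.

The following lemma characterizes $\mathcal{K}$ in terms of balayages of $\delta_x$.
The proof is the same as the proof of Theorem 47 on page 240 of Meyer (1966).
As a corollary, we get that $h_f \in \mathcal{K}$.

Lemma 4.4. Let $f \in C(K^k)$. Then $f \in \mathcal{K}$ if and only if $\int f\,d\Lambda \le f(x)$
whenever $\Lambda \stackrel{\mathcal{K}}{>} \delta_x$, $x \in K^k$.

Corollary 4.5. $h_f \in \mathcal{K}$.

Proof. Note that $h_f \in C(K^k)$ by Lemma 4.3. So, by Lemma 4.4, it
suffices to show that $\int h_f\,d\Lambda \le h_f(x)$ whenever $\Lambda \stackrel{\mathcal{K}}{>} \delta_x$ and $x \in K^k$.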

Since $K$ is compact, it is easy to see that $h_f(x)$ is a support function
on $C(K)$ (i.e., subadditive and nonnegative homogeneous as a function in $f$)
for each $x$ satisfying the conditions of Strassen's (1965) Theorem 1
(Theorem 51 on page 244 of Meyer, 1966). Thus $h(f) = \int h_f\,d\Lambda$ is the support
function of p.m.'s of the form

(4.1)  $\nu(A) = \int P(A\,|\,y)\,d\Lambda(y)$ where $P(\cdot\,|\,y) \stackrel{U}{>} F_y$ is a
Markov kernel on $B \times K^k$.

If $g \in \mathcal{F}$, then $\bar g \in \mathcal{K}$ and it is immediate from (4.1) that, for such a $\nu$,

$\int g\,d\nu = \int\!\!\int g(z)\,P(dz\,|\,y)\,d\Lambda(y) \le \int \bar g(y)\,d\Lambda(y) \le \bar g(x) = \int g\,dF_x$ whenever $\Lambda \stackrel{\mathcal{K}}{>} \delta_x$.

So, $\nu \stackrel{U}{>} F_x$. Thus, since $h(f) = \sup\{\int f\,d\nu : \nu$ is of the form (4.1)$\}$ (by the
Hahn-Banach Theorem; see (5) on page 424 of Strassen, 1965), $h(f) \le h_f(x)$. ||

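For a p.m. $\nu$ on $(R, B)$ let $\nu_0$ be the p.m. on $(R^k, B^k)$ defined by
$\nu_0(B_1 \times \cdots \times B_k) = \nu(B_1 \cap B_2 \cap \cdots \cap B_k)$ for $B_i \in B$, $i = 1, \ldots, k$. In other words,
$\nu_0$ is just the p.m. which is concentrated on the diagonal of $R^k$ and having
univariate marginals $\nu$.

Lemma 4.6. Let $\nu$ and $\lambda$ be two p.m.'s on $(R, B)$ with $S(\nu) \cup S(\lambda) \subset K$.
Then condition (c2) is equivalent to

(c2')  $\int f\,d\nu_0 \le \int f \prod_{i=1}^{k} d\lambda(x_i)$ for every $f \in \mathcal{K}$.

Proof. From the definition of $\nu_0$ it is clear that

(4.2)  $\int \bar f_1 \wedge \cdots \wedge \bar f_m\,d\nu_0 = \int f_1 \wedge \cdots \wedge f_m\,d\nu \le \int \bar f_1 \wedge \cdots \wedge \bar f_m \prod_{i=1}^{k} d\lambda(x_i)$

for $f_i \in \mathcal{F}$, $i = 1, \ldots, m$, $m = 1, 2, \ldots$ is equivalent to (c2). The equivalence
of (c2) and (c2') follows from (4.2) since $f \in \mathcal{K}$ is the uniform limit of
functions of the form $\bar f_1 \wedge \cdots \wedge \bar f_m$. ||

Proof of Theorem 4.1. (Necessity) Let $f_i \in \mathcal{F}$, $i = 1, \ldots, m$. Then, since
$P(\cdot\,|\,x) \stackrel{U}{>} F_x$, $\int f_i(y)\,P(dy\,|\,x) \le \bar f_i(x)$, and so,

$\int f_1 \wedge \cdots \wedge f_m(y)\,P(dy\,|\,x) \le \bar f_1 \wedge \cdots \wedge \bar f_m(x).$

Thus,

$\int f_1 \wedge \cdots \wedge f_m\,d\nu = \int\!\!\int f_1 \wedge \cdots \wedge f_m(y)\,P(dy\,|\,x) \prod_{i=1}^{k} d\lambda(x_i) \le \int \bar f_1 \wedge \cdots \wedge \bar f_m(x) \prod_{i=1}^{k} d\lambda(x_i).$

(Sufficiency) Let $f \in C(K)$. Then, by Lemma 4.3, Corollary 4.5,
Lemma 4.6 and the definition of $\nu_0$,

$\int f\,d\nu = \int \bar f\,d\nu_0 \le \int h_f\,d\nu_0 \le \int h_f(x) \prod_{i=1}^{k} d\lambda(x_i).$

This with Theorem 1 of Strassen (1965) gives the representation of $\nu$ in
terms of $\lambda$ and a Markov kernel $P(\cdot\,|\,x) \stackrel{U}{>} F_x$. ||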

Remark 4.7. Let $\nu(A) = \int P(A\,|\,x) \prod_{i=1}^{k} d\lambda(x_i)$. Then $P(\cdot\,|\,x) \stackrel{U}{>} F_x$ for all
$x$ is equivalent to $E(\int f\,dF_Y\,|\,X) \le \int f\,dF_X$ for every $f \in \mathcal{F}$, where $X_1, \ldots, X_k$ are i.i.d. $\lambda$
and, given $X = x$, $Y_1, \ldots, Y_k$ are i.i.d. $P(\cdot\,|\,x)$. When $U$ is an ET-system
an argument like that in Case 2 of Lemma 4.2 shows that this is equivalent to
the martingale type of formula $E(\int u_j\,dF_Y\,|\,X) = \int u_j\,dF_X$, $j = 0, \ldots, 2k-1$. For the
classical ET-system $u_j(x) = x^j$, an apt name for a sequence $X_1, X_2, \ldots$ of
random $k$-vectors satisfying
$E(\int x^j\,dF_{X_{n+1}}\,|\,X_1, \ldots, X_n) = \int x^j\,dF_{X_n}$ for $j = 0, \ldots, 2k-1$ is a $k$-mart
sequence. Theorem 4.1 characterizes the marginal p.m.'s that can correspond
to a $k$-mart sequence.
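
For $k = 1$ and the classical system $u_0 \equiv 1$, $u_1(x) = x$, the representation reduces to the familiar mean-preserving kernel for dilations. The following sketch builds such a mixture exactly; the two-point distribution and the kernel are hypothetical choices made only to illustrate the martingale-type identities.

```python
from fractions import Fraction

lam = {1: Fraction(1, 2), 2: Fraction(1, 2)}   # distribution of X (illustrative)

def kernel(x):
    """Hypothetical Markov kernel P(.|x): a fair +/-1 step, so its mean is x."""
    return {x - 1: Fraction(1, 2), x + 1: Fraction(1, 2)}

def mix(lam, kernel):
    """nu(A) = sum over x of P(A|x) times the mass of lam at x."""
    nu = {}
    for x, w in lam.items():
        for y, p in kernel(x).items():
            nu[y] = nu.get(y, 0) + w * p
    return nu

def integral(f, measure):
    return sum(f(x) * mass for x, mass in measure.items())

nu = mix(lam, kernel)
# Martingale-type identities for u_0 = 1 and u_1(x) = x.
same_mass = integral(lambda x: 1, nu) == integral(lambda x: 1, lam)
same_mean = integral(lambda x: x, nu) == integral(lambda x: x, lam)
# A convex test function integrates to at least as much under nu.
gap = integral(lambda x: x * x, nu) - integral(lambda x: x * x, lam)
print(nu, same_mass, same_mean, gap)
```

Here the first two "moments" are preserved while the integral of a convex function increases, which is the $k = 1$ form of the balayage relation.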